NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation

Neural Information Processing Systems

Visual localization is a fundamental task in computer vision and robotics. Training existing visual localization methods requires a large number of posed images to generalize to novel views, and state-of-the-art methods generally require dense ground-truth 3D labels for supervision. However, acquiring a large number of posed images and dense 3D labels in the real world is challenging and costly. In this paper, we present a novel visual localization method that achieves accurate localization while using only a few posed images, far fewer than other localization methods require. To achieve this, we first use a few posed images with coarse pseudo-3D labels provided by NeRF to train a coordinate regression network.
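The coarse pseudo-3D labels mentioned above can be obtained by backprojecting pixels with NeRF-rendered depth into world coordinates. A minimal sketch of that backprojection step, assuming a pinhole camera model; the function name and exact label-generation procedure are illustrative assumptions, not the paper's code:

```python
import numpy as np

def backproject(pixels, depths, K, c2w):
    """Lift pixels with rendered depths to world-space 3D points.
    pixels: (N, 2) array of (u, v); depths: (N,) rendered depth values;
    K: 3x3 camera intrinsics; c2w: 4x4 camera-to-world pose."""
    uv1 = np.hstack([pixels, np.ones((len(pixels), 1))])   # homogeneous pixel coords
    cam = (np.linalg.inv(K) @ uv1.T) * depths              # camera-frame rays scaled by depth
    cam_h = np.vstack([cam, np.ones((1, len(pixels)))])    # homogeneous camera coords
    return (c2w @ cam_h)[:3].T                             # (N, 3) world coordinates
```

With an identity pose, the principal-point pixel at depth 2 maps to the point (0, 0, 2) on the optical axis, which is a quick sanity check for the intrinsics convention.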



MG-HGNN: A Heterogeneous GNN Framework for Indoor Wi-Fi Fingerprint-Based Localization

Wang, Yibu, Zhang, Zhaoxin, Li, Ning, Zhao, Xinlong, Zhao, Dong, Zhao, Tianzi

arXiv.org Artificial Intelligence

Abstract--Received signal strength indicator (RSSI) is the primary representation of Wi-Fi fingerprints and serves as a crucial tool for indoor localization. However, existing RSSI-based positioning methods often suffer from reduced accuracy due to environmental complexity and challenges in processing multi-source information. To address these issues, we propose a novel multi-graph heterogeneous GNN framework (MG-HGNN) to enhance spatial awareness and improve positioning performance. In this framework, two graph construction branches perform node and edge embedding, respectively, to generate informative graphs. Subsequently, a heterogeneous graph neural network is employed for graph representation learning, enabling accurate positioning. The MG-HGNN framework introduces the following key innovations: 1) multi-type task-directed graph construction that combines label estimation and feature encoding for richer graph information; 2) a heterogeneous GNN structure that enhances the performance of conventional GNN models. Evaluations on the UJIIndoorLoc and UTSIndoorLoc public datasets demonstrate that MG-HGNN not only achieves superior performance compared to several state-of-the-art methods, but also provides a novel perspective for enhancing GNN-based localization methods. Ablation studies further confirm the rationality and effectiveness of the proposed framework.

Index Terms--Fingerprint-based localization, graph neural network, heterogeneous network, received signal strength indicator (RSSI).

Indoor localization technologies aim to estimate the position of mobile users or devices in indoor environments where satellite-based systems such as GPS are ineffective [1]. Over the past decade, a variety of wireless indoor localization techniques have been developed based on different sensing modalities, including Bluetooth Low Energy (BLE) [2], Ultra Wideband (UWB) [3], Radio Frequency Identification (RFID) [4], magnetic field sensing [5], and Wi-Fi [6], [7].
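The graph-construction step described above builds graphs over RSSI fingerprint vectors. A minimal sketch of one common variant, a k-nearest-neighbour graph over fingerprints; this is an illustrative assumption, not the paper's multi-type task-directed construction:

```python
import numpy as np

def knn_fingerprint_graph(rssi, k=2):
    """Build a symmetric k-NN adjacency over RSSI fingerprint vectors.
    rssi: (N, A) matrix, one fingerprint per reference point over A access
    points; unheard APs are conventionally filled with a floor value
    (e.g. -100 dBm) before this step."""
    d = np.linalg.norm(rssi[:, None] - rssi[None, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                 # exclude self-loops
    adj = np.zeros_like(d, dtype=bool)
    idx = np.argsort(d, axis=1)[:, :k]                          # k nearest neighbours per node
    adj[np.arange(len(rssi))[:, None], idx] = True
    return adj | adj.T                                          # symmetrize for an undirected graph
```

The resulting boolean adjacency matrix can then be handed to any GNN library as an edge list; the choice of distance metric and fill value for missing APs both materially affect the graph.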
Among them, Wi-Fi based localization has attracted a lot of attention due to the ubiquity of Wi-Fi infrastructure, low deployment cost, and compatibility with existing mobile devices without requiring additional hardware [1].

This work has been submitted to the IEEE for possible publication. This work is supported by the National Key Research and Development Program of China [Grant No. 2024QY1103] and the Shandong Provincial Natural Science Foundation, China [Grant No. ZR2024QF138]. Yibu Wang, Zhaoxin Zhang, Ning Li, and Tianzi Zhao are with the School of Computer Science and Technology, Harbin Institute of Technology, China (e-mail: 24b903081@stu.hit.edu.cn). Xinlong Zhao is with the China Mineral Resources Group Big Data Co., Ltd, China (e-mail: xinlong.zhao@qq.com).


UniAIDet: A Unified and Universal Benchmark for AI-Generated Image Content Detection and Localization

Zhang, Huixuan, Wan, Xiaojun

arXiv.org Artificial Intelligence

With the rapid proliferation of image generative models, the authenticity of digital images has become a significant concern. While existing studies have proposed various methods for detecting AI-generated content, current benchmarks are limited in their coverage of diverse generative models and image categories, often overlooking end-to-end image editing and artistic images. To address these limitations, we introduce UniAIDet, a unified and comprehensive benchmark that includes both photographic and artistic images. UniAIDet covers a wide range of generative models, including text-to-image, image-to-image, image inpainting, image editing, and deepfake models. Using UniAIDet, we conduct a comprehensive evaluation of various detection methods and answer three key research questions regarding generalization capability and the relation between detection and localization. Our benchmark and analysis provide a robust foundation for future research.


Degradation-Aware Cooperative Multi-Modal GNSS-Denied Localization Leveraging LiDAR-Based Robot Detections

Pritzl, Václav, Yu, Xianjia, Westerlund, Tomi, Štěpán, Petr, Saska, Martin

arXiv.org Artificial Intelligence

This work has been submitted to the IEEE for possible publication.

Abstract--Accurate long-term localization using onboard sensors is crucial for robots operating in Global Navigation Satellite System (GNSS)-denied environments. While complementary sensors mitigate individual degradations, carrying all the available sensor types on a single robot significantly increases the size, weight, and power demands. Distributing sensors across multiple robots enhances the deployability but introduces challenges in fusing asynchronous, multi-modal data from independently moving platforms. We propose a novel adaptive multi-modal multi-robot cooperative localization approach using a factor-graph formulation to fuse asynchronous Visual-Inertial Odometry (VIO), LiDAR-Inertial Odometry (LIO), and 3D inter-robot detections from distinct robots in a loosely-coupled fashion. The approach adapts to changing conditions, leveraging reliable data to assist robots affected by sensory degradations. A novel interpolation-based factor enables fusion of the unsynchronized measurements. LIO degradations are evaluated based on the approximate scan-matching Hessian. A novel approach of weighting odometry data proportionally to the Wasserstein distance between the consecutive VIO outputs is proposed. A theoretical analysis is provided, investigating the cooperative localization problem under various conditions, mainly in the presence of sensory degradations. The proposed method has been extensively evaluated on real-world data gathered with heterogeneous teams of an Unmanned Ground Vehicle (UGV) and Unmanned Aerial Vehicles (UAVs), showing that the approach provides significant improvements in localization accuracy in the presence of various sensory degradations.

In Global Navigation Satellite System (GNSS)-denied environments, fusing different localization modalities is crucial to provide robustness to various environmental challenges [1].
Visual-based localization requires cheap and lightweight sensors, but it is sensitive to illumination changes and textureless environments.

This work was supported by CTU grant no. SGS23/177/OHK3/3T/13, by the Czech Science Foundation (GAČR) under research project No. 23-07517S, and by the European Union under the project Robotics and advanced industrial production (reg.
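The Wasserstein-distance weighting mentioned in the abstract compares consecutive VIO output distributions. A minimal sketch of the squared 2-Wasserstein distance between Gaussians, simplified to diagonal covariances, together with an assumed exponential weight mapping; the paper's exact weighting function is not given in the abstract, so both function names and the mapping are illustrative:

```python
import numpy as np

def wasserstein2_gaussian_diag(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between N(m1, diag(s1)) and
    N(m2, diag(s2)), where s1, s2 are variance vectors. This is the
    closed form for Gaussians, specialized to diagonal covariances."""
    return float(np.sum((m1 - m2) ** 2) + np.sum((np.sqrt(s1) - np.sqrt(s2)) ** 2))

def odometry_weight(dist, scale=1.0):
    """Map a distance between consecutive estimates to an edge weight:
    identical estimates get weight 1, diverging ones are down-weighted
    (an assumed exponential decay, not the paper's exact rule)."""
    return float(np.exp(-dist / scale))
```

Identical consecutive distributions yield distance 0 and full weight, while a jump in either the mean or the reported uncertainty shrinks the weight, which is the qualitative behaviour the degradation-aware fusion relies on.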